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We adapt the time-evolving block decimation (TEBD) algorithm, originally devised to simulate the dynamics 
of ID quantum systems, to simulate the time-evolution of non-equilibrium stochastic systems. We describe this 
method in detail; a system's probability distribution is represented by a matrix product state (MPS) of finite 
dimension and then its time-evolution is efficiently simulated by repeatedly updating and approximately re- 
factorizing this representation. We examine the use of MPS as an approximation method, looking at parallels 
between the interpretations of applying it to quantum state vectors and probability distributions. In the context 
of stochastic systems we consider two types of factorization for use in the TEBD algorithm: non-negative matrix 
factorization (NMF), which ensures that the approximate probability distribution is manifestly non-negative, and 
the singular value decomposition (SVD). Comparing these factorizations we find the accuracy of the SVD to 
be substantially greater than current NMF algorithms. We then apply TEBD to simulate the totally asymmetric 
simple exclusion process (TASEP) for systems of up to hundreds of lattice sites in size. Using exact analytic 
results for the TASEP steady state, we find that TEBD reproduces this state such that the error in calculating 
expectation values can be made negligible, even when severely compressing the description of the system by 
restricting the dimension of the MPS to be very small. Out of the steady state we show for specific observables 
that expectation values converge as the dimension of the MPS is increased to a moderate size. 

PACS numbers: 02.50.-r, 02.70.-c, 03.67.Mn 



I. INTRODUCTION 

Non-equilibrium stochastic systems have attracted interest 
for two key reasons. Firstly, they exhibit a wealth of non- 
trivial phenomena, not observed in equilibrium systems, such 
as boundary-induced phase transitions 1 1 J and infinite range 
correlations |2|. Secondly, systems far away from equilib- 
rium have applications in describing a variety of driven dif- 
fusive processes, including vehicular traffic flow lHJjSl and 
biological transport ||6H9l. There exists a range of analyti- 
cal approaches with which to address such systems, including 
matrix product methods ITOl [TTl and the Bethe ansatz 1(121 . 
However, there is no complete analytical framework for non- 
equilibrium systems, especially away from the steady state, 
which increases the importance of numerical techniques. 

To simulate these systems, we adapt the time-evolving 
block decimation (TEBD) algorithm, a well established nu- 
merical technique for studying the time-evolution of quan- 
tum systems liT3l [T4l . TEBD has been used to compute 
the dynamics of spin chains lfT4l4T6]| and the Bose Hubbard 
model WT\ ITSl . which describes an array of ID quantum 
systems including cold atoms in optical lattices |19|. The 
algorithm provides access to various dynamical properties 
like atomic many-body currents fSOl, non-linear responses to 
driving fields [21] and the formation and transport of quasi- 
particles Il22ll23]| . The adaption we present here simulates the 
time-evolution of non-equilibrium stochastic systems, and we 
refer to it as classical TEBD (cTEBD). 

The principle method used to simulate non-equilibrium sys- 
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tems is dynamic Monte Carlo (DMC), first introduced by |24|, 
and two recent examples can be found in |25 26 1. cTEBD 
provides a means of calculating expectation values in a fun- 
damentally different way to DMC simulations. The latter av- 
erages over observable values calculated from many trajecto- 
ries, each a possible evolution of the system, while here we 
evolve an approximation to the probability distribution of the 
system from which observable values are calculated. The mo- 
tivation is to provide a complimentary method to DMC with 
a contrasting source of error; rather than insufficient sampling 
we deal with errors arising from approximating a probability 
distribution by a matrix product state (MPS) i27H30i . 

Related to TEBD and MPS, a family of numerical tech- 
niques originally devised to calculate ground state proper- 
ties of quantum systems, and later applied to classical non- 
equilibrium systems, is the density matrix renormalisation 
group (DMRG) methods ll3Tti33l . The application of DMRG 
methods to stochastic non-equilibrium systems in f34l j36l 
suggest that probability distributions of many systems may 
be approximated well by MPS of small dimension. How- 
ever, DMRG methods may only be applied to steady states 
of stochastic systems, not general dynamics which we ad- 
dress in this paper. Also, as shown in ll35l for the reaction- 
diffusion model, they face numerical instabilities for large sys- 
tems, arising from non-Hermitian diagonalization. 

A key premise that motivates us to adapt the quantum 
TEBD algorithm is that a large class of Markovian stochas- 
tic systems evolve according to a Schrodinger-like equa- 
tion ll37l l38l . hi this case the evolution is in imaginary 
time and governed by a non-Hermitian stochastic Hamilto- 
nian. We demonstrate how this similarity between quan- 
tum and stochastic evolution can be exploited by applying 
cTEBD to the totally asymmetric simple exclusion process 
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FIG. 1 : (Color online) The TASEP is a paradigm of stochastic non- 
equilibrium systems. Three Poisson processes contribute to the dy- 
namics of the system. Particles are injected into the first site with rate 
a if it is empty and ejected from the A'-th site with rate (3 if occupied. 
Finally, particles in the £-th site hop to the (£ + l)-th site with a rate 
Y if that site is empty, resulting in flow from left to right. 



equilibrium systems. 

The structure of the paper is as follows: We review MPS, 
their construction and physical interpretation in Sec. 
Next, in Sec. (|IIl| we show how MPS can be evolved effi- 
ciently and discuss the properties of two factorization methods 
for use in cTEBD, namely NMF and SVD. The TASEP and 



its mathematical representation are introduced in Sec. (IV i. In 
Sec. (jV]i we compare the performance of cTEBD with NMF 
and SVD, before focusing on the errors of the SVD-based al- 
gorithm for small systems. We then go on to apply cTEBD to 



the study of larger systems in Sec. ( VIi before presenting our 



conclusions and outlook for future work in Sec. ( |VII[ ) 



(TASEP) Il39lll0|. The TASEP shown schematically in Fig.[T] 
is the quintessential driven particle hopping system. Such sys- 
tems share the following features: particles occupy a ID lat- 
tice; the boundary sites are connected to particle reservoirs, 
so particles are injected and ejected there; and a hopping pro- 
cess causes particles to move through the lattice (see flni42l 
for reviews). Simple particle hopping systems have drawn 
much attention, like the Ising and Heisenberg models of equi- 
librium statistical physics |43|, because they are archetypal 
of non-equilibrium physics. Importantly, the TASEP has a 
Hamiltonian that consists of single-site and two-site nearest- 
neighbor terms only lfT0ll44l . Using this, we will outline how 
the cTEBD algorithm approximately time-evolves a probabil- 
ity distribution by the repeated application of two-site nearest- 
neighbor operators, each of which affects only the matrices 
of the MPS associated with those two sites. After each two- 
site operation the algorithm ensures that the description of the 
system remains compressed by using a matrix factorization, 
returning it to an MPS. We have adapted the TEBD to allow 
for a range of factorizations to be used as part of the algo- 
rithm, and choosing which factorization is the key element in 
determining the computational cost and accuracy of cTEBD. 

The two candidates we focus on are the singular value de- 
composition (SVD) |45| and the non-negative matrix factor- 
ization (NMF) 1 46, 47 1, which by restricting the matrices of 
the MPS to be non-negative (called an sMPS) allows for an 
information theoretic interpretation, as recently investigated 
in |48|. We compare the performance of these two factor- 
izations and find that while SVD algorithms are well devel- 
oped, stable and efficient, current NMF algorithms are unable 
to identify accurate factorizations, due to the non-convexity of 
the task. 

As a result we focus on the SVD-based cTEBD algorithm 
and investigate its performance in simulating the TASEP. We 
look in detail at the two main sources of error: the Trotter 
error, which is due to approximating the evolution over each 
time-step by a sequence of two-site operations, and the trun- 
cation error, due to compressing the state to an MPS of small 
dimension. For small systems we show how these errors be- 
have, as well as some bounds on them, and then analyze how 
these results scale with the system size up to hundreds of lat- 
tice sites. The results obtained promise that cTEBD could be 
applicable to and accurate in simulating a wide range of non- 



II. MATRIX PRODUCT STATES 

Matrix product states are essentially a special structure 
for approximating vectors in extremely high dimension vec- 
tor spaces, which typically arise from the tensor product of 
smaller vector spaces. Since our use of MPS for stochastic 
systems is motivated by and in many ways analogous to their 
use in ID quantum systems, we begin by introducing MPS 
first for quantum state vectors ifTSl |49l l50l and then in the 
context of probability distributions. 



A. Quantum systems 

To illustrate MPS in their most general form we consider 
a quantum system composed of sites each with a local d- 
dimensional Hilbert space. An arbitrary quantum state of 
this system \\\t), normalized according to the L2-iiomi as 
1 1 1 \(/) 1 1 2 = 1 , can be expressed as 

IV) =LVi|i), 

where i ~ {iiih,' ■ ■ Jn) is an N-tuple of local indices 
it = 0,- ••,£/ — 1 specifying a complete configuration and 
{| i) = I /i) I /2) • • • I In)} are the set of orthonormal configura- 
tion states. Since | \\t) inhabits the tensor product vector space 
((D^)®^, whose dimension grows exponentially with as d^, 
an exact description of its amplitudes rapidly becomes in- 
tractable with increasing A^. To attempt to overcome this scal- 
ing, we expand \\t\ as 

V,/. =aW'iaP1'2---A['^1«, (1) 

where for each i( the factor aI^''* is a complex matrix 
d^Zf-ixZf^ aside from the boundary terms aI'I'i and aI'^I'" 
which are instead complex vectors (D^^''' and <CX-n-i^ \ re- 
spectively fTT). The expansion in Eq. ([T]i is simple to visual- 
ize; each site £, depending on its local configuration if, con- 
tributes a matrix A t^l''' to a site-ordered multiplication of matri- 
ces which results in the amplitude \\ti. The boundary vectors 
then ensure that a scalar is recovered. Such an MPS can be 
thought of as a tensor network, as depicted in Fig.|2] A prod- 
uct state, whose MPS requires dimensions Xc = 1 for all £ 
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FIG. 2: (Color online) An MPS for a vector | \(/) as a comb-like tensor 
network. Each site £ of the system has a tensor A 1^1 associated with it, 
illustrated by the shaded circles. The thick vertical leg of a tensor aI'I 
represents the physical index i(, while the thinner leg(s) portray the 
internal index (indices). The boundary tensors have a single internal 
leg and the others have two. The joining of tensor legs represents the 
contraction (or matrix multiplication) over the internal indices. 



boundary connecting the two parts up to logarithmic cor- 
rections at criticality, and area laws therefore severely limit 
the entanglement entropy physical systems can possess. For 
quantum spin chains with short-range interactions the area law 
is manifested in an extremely rapid decay of the Schmidt co- 
efficients {Xlf'l} [14 1 



An immediate consequence of this is 

of |¥> 

can be formed by truncating the sum in Eq. (|2| to the first % 
terms lfT4l as 



that an accurate (un-normalized) approximation | Vj/I^' 



has no correlations between different sites, while the MPS de- 
scribing a vector with arbitrarily strong correlations may need 
matrix dimensions up to Xf = min (t/^jt/'^^^), and so scale 
exponentially with the system size A^. Thus in describing ar- 
bitrary vectors an MPS description offers no advantage. 

The utility of an MPS formulation, however, is based pre- 
cisely on the fact that physical states have a structure which al- 
lows for a near exact MPS description while restricting %( < % 
with X much smaller than the largest X(. For quantum sys- 
tems, coiTelations in a many-body state \\\t) are quantified 
by the entanglement entropy and this in turn has a rigorous 
connection to the required MPS dimension % [13. ,51 1. This 
is shown by first splitting the system into two contiguous 
blocks lI^I ={ 1 ,•••,€} and rM = {£ + 1,- ■ ■ ,N}. The state 
is then expanded as |\[/) = Lij¥ij|i)L|j)R, where i and j la- 
bel configurations of the two subsystems and \(/ is now the 
reshaped d'' x d^^'^ matrix of amplitudes. To expose the cor- 
relations of this state an SVD [45 [ is performed which fac- 
torizes \(/ = UDV\ where U and V are unitary matrices and 
D is a diagonal matrix with real non-negative elements. This 
establishes a so-called Schmidt decomposition ll52l[ of | \(/) as 



l¥) 



(2) 



Here {|L|f')} and {|R|f')} are Schmidt states derived from 

the columns of U and V , while {Xjf' } are the singular values 
(Schmidt coefficients in this context) specified by the diagonal 
elements of D, arranged in non-increasing order with /j. The 
normalization of | \(/) and unitarity of U and V result in 



:(A|:1)2 = 1 and (L[fl|L[^l) = (R[fl|R[:^- 



The mutual information between the subsystems Lt^l and rI^I 
is equal to twice the entanglement entropy [53], given by 



5LR = -i:(Xlfl)2log [Xl 



The size of the entanglement entropy between a subsystem 
and the rest of the system for ground states has been ex- 
tensively studied in the context of area laws (see [|54l for a 
review). These have shown that S'lr should scale with the 



The error made in computing the expectation value of any 
quantum mechanical observable with rather than |\|/) 

is bounded by the L2-norm of the residual f5T\, given by 



|v)-|VfrM) 



(A,|f')^]^/^ and is therefore small if 



1 2 - \X^^=x+iy"-i' 



the sum of the squares of the truncated Schmidt coefficients 
is small [ 14 [. These observations can be directly related to an 
MPS, as described in Appendix [A} by applying this decompo- 
sition and truncation in an iterative sequence to all contiguous 
bipartitions to form an approximation [13 [. This approxi- 
mation is an MPS of the form of Eq. ([T]) but with a maximum 
matrix dimension %. The errors in this case are bounded by 
the singular values [51] as 



l¥)-|¥)l 



< 



2 L L (xjfO 



Ma2 



-,1/2 



(3) 



A rapid decay of {A-|f } with /j for each bipartition thus enables 
% to be kept small while retaining a high accuracy, and results 
in an MPS description parameterized by 0{Nd'/}) complex 
numbers llT4l . This near lossless compression of the infor- 
mation in {¥i} for physically relevant vectors is responsible 
for the enormous success of the MPS approach used within 
DMRG imim and TEBD methods [[131 [T4l . 



B. Classical systems 

The general MPS formalism introduced can equally be ap- 
plied to stochastic classical systems. We now consider a 
system of sites each with d local configurations. The 
state of the system is then described by a probability vector 
I P) ~ Li^il i) with d^ real non-negative components {P\} cor- 
responding to the probabilities of the system being in each of 
the configurations {i}. Classical states \P) are therefore con- 
tained in the positive orthant of the vector space (R'')®'^ and 
are normalized according to the Li-norm as HlP)!!! — 1. An 
immediate restriction to real matrices {aI^''' } within the MPS 
expansion in Eq. ([T} can be made in this case. A major ques- 
tion which we will address in this work is whether it is nec- 
essary, or indeed desirable, to further restrict the matrices to 
be non-negative. To do so manifestly ensures that the sMPS 
approximation to | P) never leaves the positive orthant. Early 
exact analytical calculations based on an MPS formalism were 

sMPS noi. 
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For classical systems a similar link between the dimension 
% required and correlations in | P) can be established by work- 
ing exclusively with non-negative quantities. As before, we 
bipartition the system after site £ into two contiguous blocks 
and and then expand |P) = Liji^j|i)L|j>R- Following 
Temme and Verstraete in BSl . we may draw a close analogy 
to the Schmidt form of a quantum state via a decomposition 
of P as 



(4) 



which has the elegant interpretation of having } as proba- 
bilities, again arranged in non-increasing order with /j, which 
mix together the conditional marginal probability distribu- 
tions Pi ' and P^ . This could equally be expressed as the 
vector decomposition 



1^) 



Xe 



(5) 



where 



and |Rjf') have elements {^^'^ (i|/j)} and 

{^r VJIa')} respectively, similar in form to the Schmidt de- 
composition in Eq. (j2| except that | P) is expanded in terms 
of non-negative vectors rather than orthonormal ones. In this 
case the mutual information is upper-bounded by the entropy 
of the mixing probability distribution ll48l . given by 



S{{pfi^}) = -Lp^^iog 



Akin to the truncation of a Schmidt decomposition, 
an (un-normalized) approximation to \P) can be 

formed by truncating the decomposition in Eq. Q to 
the first % terms, while the Li-norm of the residual is 



l.f=X+^P^^'^■ In Appendix 



B 



we show that 



\\\P)-\P^'^)\\i 

for classical systems the error made in computmg any observ- 
able is upper-bounded by the Li-norm error of the probability 
vector, analogous to the role of the L2-distance for quantum 
systems. The decomposition Eq. (j4]) is in general non-unique 
and the most significant decomposition at this bipartition is 
the one with the spectrum {p|f'} that has the smallest entropy 
Sc, called the entropy cost |48 1. This suggests that the entropy 
cost Sc should be considered analogous to the entanglement 
entropy S'lr since it quantifies both the quality of approxima- 
tion and correlations. However, since Sc only upper-bounds 
the mutual information the connection between correlations 
and truncation errors is substantially weaker than that seen in 
the quantum case. Just like for quantum systems, repeated ap- 
plication of the decomposition in Eq. (j5]l at all bipartitions can 
be used to build an approximate sMPS | P) , and in 1481 it was 
shown that the error is bounded by 



N- 1 * 



l^')-l^)lli< E E z'. 

e=i fj=x+i 



Exact sMPS solutions for TASEP stationary states find that the 
entropy minimizing spectrum decays very quickly |48 1 thus 
indicating in a similar way to the quantum case that a small % 
can be chosen while maintaining an accurate description. 

In this work we note that each decomposition Eq. Q is in 
fact precisely equivalent to an exact NMF of the matrix P, 
analogous to performing an SVD in the quantum case. The 
exact NMF can be written as P = UDV^ where the columns of 
U and V are the normalized non-negative probability vectors 
in the decomposition Eq. (|5]l and the diagonal elements of D 
are the normalized probabilities {pjf'} [55 J . As was done with 
the SVD, I pM) is formed by truncating the inner-dimension of 
the NMF to %. 

While the NMF may be a more natural choice for decom- 
posing probability vectors at a formal level, it is perfectly pos- 
sible to employ S VDs on | P) in precisely the same way as for 
a quantum state. The L2-norm error is given by Eq. Q, and 
the L2-norm upper-bounds the more relevant Li-norm with 
a factor of the square root of the dimension of the vector 
space [45 1, hence for the approximation | P) built in this way 



\P)-\P)\U < A = 2^/2 



N- 1 Xf 

2E E 



1/2 



(6) 



The SVD approach has two pleasing features: firstly, it 
is unique, and secondly, given \P) = Lij^j|i)L|j)R and 
\Q) — Lij2ij|i)L|j)R, a well known result first proposed by 
Eckart and Young ||56l is that | pl^l) is the solution to the opti- 
mization problem ||451 



(7) 



rank(e)<)c 



for any 1 < X < X(. Thus the L2-norm error of a truncated 
SVD will always be less than that of a truncated NMF or any 
other factorization of the same rank. This ensures that if there 
exists an exact MPS of some dimension % then there also ex- 
ists an exact SVD-MPS of dimension % or less. 

A powerful feature of the SVD is the orthogonality con- 
straint on the singular vectors (the columns of U and V). How- 
ever, this typically results in the U and V matrices having el- 
ements of mixed signs, even for non-negative P, and so does 
not permit any obvious information theoretic interpretation of 
the components of the decomposition, as exists for an NMF. 
This also means that when | P) is truncated to form | P^^^), neg- 
ative probabilities may enter our description. The maximum 
negativity of any element of is 



max I 



pI-^\<\\\p) 



< 



Xe 

E 



Thus while non-negativity is not conserved it is controlled by 
a similar criterion to the accuracy of the factorization, namely 
the smallness of the truncated singular values. This bound 
is loose, for example in the worst case limit of truncation to 
X = 1 non-negativity is in fact preserved due to the Perron- 
Frobenius theorem |57| which ensures that both the left and 
right singular vectors associated to the largest singular value 
of a non-negative matrix are themselves non-negative. 



III. TIME-EVOLVING BLOCK DECIMATION 

We have seen that MPS can provide an accurate and ef- 
ficient description of some probability distributions. Our 
purpose for this section is to review the TEBD algorithm, 
originally devised to update a quantum MPS under unitary 
time-evolution lfT3l 1741 1181. but now paying specific atten- 
tion to issues arising from its use on classical stochastic sys- 
tems. Analogous to quantum systems, the evolution of a large 
class of stochastic systems, for a time f, is performed by 
|P(f)) = e\p{Ht)\P) ||321|38l. The evolution in this case is 
stochastic and // is a stochastic Hamiltonian, the properties of 
which will be discussed in Sec. jlV) . 



A. Time-evolving the state 

For anything but the smallest systems neither | P) nor the 
full stochastic evolution operator exp{Ht) can be represented 
exactly. Having applied an MPS to tackle the description of 
\P) we now follow the standard approach used for similar 
quantum evolutions lil4J . namely breaking it up into n small 
time-steps 5f = t/n as exp(i/f) = HyLi exp(//5f) and then 
splitting exp(//5f) into a product of two-site operators. To 
do the latter we first restrict ourselves to considering stochas- 
tic Hamiltonians H composed of a sum of nearest-neighbor 
two-site terms {hci^i} as 



(a) 



H 



N-l 



where we have incorporated any single-site terms into 
{he^+i}. The TASEP is in this restricted class flOlBl. We 
can proceed to split up exp(//5f) via a second-order Suzuki- 
Trotter expansion 1581 as 




(b) (i) 

(iii) K (iv) U D 

(V) a'Ma'[^+i 



FIG. 3: (Color online) (a) A circuit diagram showing the second- 
order Suzuki-Trotter expansion used in this work. One time-step 
exp(/f5f) is approximated by applying two-site stochiastic operators 
Se between each nearest-neighbor pair of sites, sweeping first from 
left to right and then back again, (b) A schematic tensor diagram 
of the core procedure in the TEBD algorithm 1131 1141 for applying 
a two-site operator, described in the text. Firstly, in (i) the physical 
legs of operator Sf are contracted with the relevant legs of the ten- 
sors {A"} in the MPS. This multiplies the state by 5^- and involves 
only the tensors shaded. The result of this is a two-site order-4 tensor 
shown in (ii). Next in (iii) the tensor is reshaped as a matrix, 
resulting in two so-called fat legs. The most computationally expen- 
sive step is (iv) where an approximate matrix factorization of 0, of 
the form = UDV^ is computed. Finally in (v) the reshaping of 
U and DV^ is equivalent to unravelling their legs to form the newly 
updated A'l^l andA'I^+^l tensors. The diagonal matrix D is absorbed 
into the during the left to right sweep, as shown by the shading in 
(iv), and into the U for the opposite direction. The A'I'l tensor con- 
structed from an orthogonal matrix only, automatically obeys one of 
the orthonormality constraints detailed in Appendix [A] Which way 
the D is absorbed then follows from requiring that the tensor for the 
next two-site operation has indices which correspond to orthonormal 
bases, as discussed in Sec. (jnTCj. 



^H5t 




1&/2 



0(5f3), (8) 



which represents an approximation because overlapping 
terms, e.g. and he+i/+2, do not generally com- 

mute. This expansion therefore applies a sequence of two-site 
nearest-neighbor stochastic operators Se = exp(/!f £+i5f/2) 
sweeping across pairs of sites, as depicted in Fig. |3ja). A 
consequence of this expansion is that the approximation of 
exp(//f) remains stochastic and so preserves the Lj-norm and 
non-negativity of a probability vector \ P). In Appendix [C| we 
show that for stochastic systems, the corresponding Li-norm 
error incurred by using this expansion, which we call the Trot- 
ter error, scales at worst as Est ^ tbt^. 



B. Applying a two-site stochastic operator 

The core of the TEBD algorithm is a procedure for applying 
a two-site nearest-neighbor operator to an MPS and perform- 
ing a systematic truncation of the result back into an MPS of 



dimension % T\3\ \T4\ . Using this procedure then allows the 
sequence of operators Se making up the Trotterized approxi- 
mation to exp(//5f ) to be approximately applied to \P). Given 
an MPS of form |P> ^ liAl^l'iAPl'z • • -Al^l'-^li) the operator 
Se can be applied exactly to \P) and only affects the aI'1 and 
Al^'+i] tensors, as shown in step (i) of Fig.lJb). This forms a 
new order-4 tensor associated to these sites with the contrac- 
tion written out explicitly as 



d-l 



L (Si) 



't't+i 
ji'ji+i 



7 1' 



The resulting state Si\P), depicted in step (ii) of Fig. [3jb), 
no longer has a proper MPS form due to the presence of a 
structureless two-site tensor. To establish an MPS form we 
first reshape the tensor into a conventional matrix by com- 
bining the indices (/^,v) and {ie+i,iJ) into so-called fat indices 
which represent the row and columns respectively, as depicted 
in step (iii) of Fig.[3jb). The resulting matrix is then subject 
to a matrix factorization which approximately decomposes it 
into the form — UDV^ , where D is a R"^'' diagonal matrix. 
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U e R^^^jV € R'''^'*'' and % is the desired MPS dimension. 
This is step (iv) in Fig. [3];b) and is the most computationally 
expensive part of the procedure, requiring a number of com- 
putations 0{d^X^) for the SVD. After absorbing D into either 
the U 01 matrix according to the direction of the Trotter 
sweep, the matrices are reshaped into order-3 tensors A'l^' and 
^/[f+i] jjj lYds way, as shown in step (v) of Fig.jsjb), the fac- 
torization is used to impose internal structure on the original 
tensor and by replacing with the new tensors establishes 
a proper MPS for St\p). Crucially, however, the factoriza- 
tion also provides the means of truncating the MPS dimen- 
sion to some desired maximum %, controlling the growth of 
the description at the expense of becoming an approximation. 
We call the Li-norm error due to these approximate factoriza- 
tions the truncation error En- and we discuss the calculation of 
quantities which bound this error in Appendix [C] In Sec. (|ll| 
we have studied the analytic properties of the SVD and NMF 
factorizations, while now we focus on the numerical proper- 
ties of these factorizations and their accuracy when used in the 
two-site procedure described above. 



1. Non-negative matrix factorization 

There is no known algorithm for finding (non-trivial) ex- 
act NMF solutions for a given mx n non-negative matrix 0, 
despite their existence [591, and so an NMF based approxi- 
mation cannot currently proceed by truncation from an exact 
factorization. Instead an approximation © = WH^ is formed, 
where W and H are non-negative matrices, by solving directly 
the reduced rank minimization problem 



min {F{&,WH^)}. 



HeK 



where F{&,WH^) is a cost function that quantifies the quality 
of the approximation to 0. By forming diagonal matrices D\v 
and Dh containing the column sums of W and H any NMF can 
be brought into the form = UDV^ where D is a diagonal 
matrix D = DwDh, satisfying tr(D) — Lij0ij, and U and V 
Sire column stochastic |55 1. Common choices for F are 



F(0,W//^ 



Iy|0y-(H'i/Oljl 

Iij|0ij-(W//^)ij|, 



0ij + (W//^)ij}, 



(9) 

which are the L2-norm of the residual reshaped as a vector 
III®) - |0)||2, the generalized Kullback-Leibler (KL) diver- 
gence |72|, and the Li-norm || | 0) — 1 0) || i, respectively |60|. 
A crucial issue for any minimization algorithm based on these 
cost functions is that although they are convex for W and H 
separately they are not convex for W and H simultaneously, 
meaning the problem can have many local minima. A com- 
mon strategy for minimization is to alternate the optimization 
between W and H while the other is fixed, however such algo- 
rithms can, at best, only guarantee convergence to a local min- 
imum. Furthermore there is no guarantee that the global min- 
imum, if located, is unique. The non-uniqueness of an NMF 



is evident given that any R'^^" non-negative monomial matrix 
M can also form an alternative NMF as = WMM^^H^ . In 
Appendix [D] we outline some popular and simple NMF algo- 
rithms. From numerical test we have found, as expected, that 
the Li-norm minimizing NMF algorithms consistently deliver 
the smallest Li-norm errors and so we concentrate on this cost 
function. 



2. Singular value decomposition 

For any real m x n matrix there exists an SVD of the form 
= UDV'^, where U and V are R"^^ and R"^^ orthogo- 
nal matrices respectively, and X = min(m, n) is the maximum 
possible rank of B31 . In this context D is a R^^'''^ diagonal 
matrix of non-negative singular values {A,^} arranged in non- 
increasing order with /j, and the columns of U and V are the 
singular vectors. In contrast to the NMF, the SVD is a unique 
decomposition up to the signs of its left and right singular 
vectors. These properties, as well as the Eckart- Young theo- 
rem in Eq. (j7]i, not only make the SVD very mathematically 
appealing but have also aided in the construction of efficient, 
accurate and numerically robust algorithms for their computa- 
tion |45|. The L2-norm minimizing rank-^ approximation 
is formed by truncating the inner dimension in the SVD from 

to %. 



C. Optimality of factorizations within cTEBD 

For the calculation of arbitrary expectation values, we 
would like to minimize the distance between \P) and its 
cTEBD approximation | Q) . Within cTEBD it is the local 
tensor that is factorized, and discussions in the previous sec- 
tion explain how the algorithms will attempt to solve for the 
optimality of the approximation to the tensor Here, we 
examine the optimality of this factorization for the distance 
between | P) and | Q) . 

A unique property of the L2-norm is that it is unitarily in- 
variant and so we are free to evaluate it in any orthonormal 
basis. For this reason the SVD truncation of 0, which min- 
imizes the corresponding local error || 1 0) — 1 0) ||2, also mini- 
mizes the global error 1 1 1 P) — | 2) 1 1 2 between the correspond- 
ing vectors, so long as the indices of and correspond to 
an orthonormal basis. In AppendixjAjwe discuss the orthonor- 
mality constraints that the tensors {A 1^1} need to obey for this 
to be the case. In the context of classical systems, where inher- 
ently non-unitary local stochastic operators are applied [73], 
maintaining such orthonormal properties of the tensors {aI'I} 
is by no means guaranteed. Yet it turns out that by applying 
the two-site operators in a sweeping sequence across the sys- 
tem the relevant bases for each pair are always orthonormal 
and the SVD truncation is optimal for each operation |61|. 
This is described in detail in Fig.|4] The second-order Suzuki- 
Trotter expansion introduced earlier is based precisely on such 
a sweep and is therefore an ideal choice for implementing 
generic non-unitary evolution on an MPS if the L2-distance 
is the cost function to be minimized. 
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FIG. 4: (Color online) Following f611 we illustrate the orthonormal- 
ity structure of an MPS. Here the boxes correspond to the sets of 
basis states for the left and right subsystems for different bipartitions 
(see Appendix|A]for details). If these states form an orthonormal ba- 
sis then the box is shaded, and correspondingly it is hashed if they do 
not. (a) This is the situation obtained by following the procedure de- 
scribed in Fig.[3]up until 5/_i has just been applied and S( is next to 
be applied. All splittings to the left of the one between sites (i? — 1 , 
have left states which are orthonormal, while all those to the right 
have their right states orthonormal. The orthogonality centre is said 
to be located after site If the two-site stochastic operator 5^ is 

applied to sites I) then we can be assured that all the indices 

of the tensor are orthonormal since subsystems lI'^'I and r[*+'I 
have orthonormal bases, (b) This is the situation after the applica- 
tion of 5/. While the action of this non-unitary operator destroys 
the orthonormality of the unitarity of the SVD decomposition 

establishes it in lI^I. This results in the orthogonality centre mov- 
ing one site to the right. If we applied another two-site operator to 
sites (i* + I , £ + 2) then we are again assured of the orthonormality 
of the tensor indices since subsystems lW and have orthonor- 

mal bases in (b). Finally, analogous changes in the orthonormality 
structure occur for two-site operations moving to the left. 



For cTEBD based on sMPS and NMFs, orthogonality is 
less useful since in general it cannot be imposed on an sMPS 
due to its non-negativity, and moreover the Li-norm mini- 
mized by the NMF is not unitarily invariant. Despite the lack 
of optimality the Li-norm error of an NMF of the local ten- 
sor can be shown to upper-bound the full Li-norm error of 
the vectors. To see this we use the fact that the full Li-norm 
between the exact vector | P) and the approximate one | g) 
simplifies in matrix product form because the sMPS of the 
two states differ only for the two sites (£, £ + 1 ) on which an 
operator is applied, 

Ill^')-Ifi>lli 

^ 2^ . . .^['^-l]'>-i (0'V'/;+i _ 0'>'>+i )^l''+2]''c+2 . . .^M 



l'£'£+l 

< L I' 

W-IW+l '£''£+1 



.0'£'£+l I lc[f+21 ...^M 



/^'£'£+l 



c»'£'£+l 



Here we have used the non-negativity of tensors {A I*'} along 
with the fact that the left Ct^l • • • C^^- '1 and right Cl^'+^l • • • Ct'^l 
products, where C^^ — L,-^ A''^!'', form a row and column vec- 
tor, respectively, composed of non-negative elements < 1. 
The last line is then the Li-norm of the local tensor approxi- 



mation minimized by the NMF, reshaped as a vector. 



IV. THE TASEP 

Having introduced cTEBD, the rest of this paper deals with 
analyzing the performance of cTEBD on a test system, the 
TASEP, which we describe in Fig. [T] F or a detailed review 
of the TASEP we refer the reader to T39ll40ll . but for our pur- 
pose it is sufficient to note that the steady state of the sys- 
tem has been solved analytically [10], while numerical sim- 
ulations are needed to calculate arbitrary expectation values 
away from steady state. Like many of the particle hopping 
systems, the TASEP is modeled as a chain of lattice sites, 
each of which can be in one of d configurations, and thus can 
be described by the formalism introduced in Sec. (II Bi. For 
the TASEP d = 2 and the local configurations ii = 0, 1 corre- 
spond to empty and occupied sites respectively. As for a large 
class of stochastic systems with configurations connected by 
Poisson processes, the probabilities of being in each complete 
configuration \P\{t)] evolve according to a master equation of 
the form 



E(^i'(0^i' 



(10) 



where i^i-j-i' is the Poisson rate governing the transition from 
i to i' ifTTl . The first term on the right hand side gives the rate 
of transitions into configuration i and the second gives the rate 
of leaving it. 



We may rewrite such a master equation Eq. (10 1 as a 



stochastic Schrodinger equation Il37ll38]| of the form 



dt 



\P{t))^H\P{t)), 



where the stochastic Hamiltonian H is defined by 

{i\H\i')=Ri,^i for i^i', 
{i\H\i) = -Y,Ri^i'- 



(11a) 
(lib) 



i¥i 



The probability vector describing stochastic systems therefore 
undergoes non-unitary evolution via a Schrodinger equation in 
imaginary time with a non-Hermitian stochastic Hamiltonian. 
From Eqs. 



11 



a stochastic Hamiltonian has non-negative 
off-diagonal elements, a consequence of the non-negativity of 
transition rates, and non-positive diagonal elements to ensure 
the conservation of probability 

E(i'|//|i)=0. 
i' 

It has been shown in [62] that there always exists a station- 
ary state, an eigenvector of H corresponding to a zero eigen- 
value, and this vector is unique if H is strongly connected, i.e. 
all configurations are accessible, which is true of the TASEP. 
This implies ergodicity; after long times it will arrive at a dis- 
tribution independent of its initial state. The Hamiltonian for 
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FIG. 5 : (Color online) (a) The accuracy of approximations to the stationary state | P) of the 1 0-site TASEP with a = P = 1 . The Lj -norm error 
is plotted against x for the SVD (o) and Lj-norm minimizing NMF (<C>) performed at the central bipartition. For each SVD the Lpnorm error 
due to the sum of all negative elements is shown (x) along with the upper-bound to the Li-norm error (dashed line) derived from the exact 
L2-norm error, (b) Errors during the evolution of the 10-site TASEP with a = 0.25 and P = 0.75. The NMF and SVD algorithms are compared 
for X = 3 and 5/ = 10^^ where for each NMF four random restarts have been performed. Initially, the system was in the state describing a 
uniform distribution over all configurations. With decimation occurring at the central bipartition only, we plot the upper-bound to the Li-norm 
error incurred during each time-step, discussed in Appendix |C] using the SVD (smooth dashed line) and NMF (jagged solid line), and the 
actual Li-norm error for the SVD (□) and NMF (o) at time /. (c) The same as (b) but with decimation occurring at all contiguous bipartitions. 



the TASEP can be written as 



H = hi +hi, 



N-l 



(12) 



where the single-site terms hi and h^ describe the input and 
output of particles and the two-site nearest-neighbor terms 
{he.e+i} describe the hopping. The Hamiltonian terms can be 
calculated using Eqs. 



11 



and are given in 1441 . So evolution 
is generated by a stochastic operator of the form considered in 
Sec. ( jmAl i. 

For simplicity we take time t to be in units of the inverse 
of the hopping rate y^' and set 7= 1. Two observables that 
we will be particularly interested in are the density at site £, 
P(, which is equal to the local configuration /(>, and the current 
between site £ and £+1, J(, equal to 1 if in — l,i(+i — 0, and 
otherwise. 



V. SMALL SYSTEM ANALYSIS 

A. Comparison of the NMF and SVD for the TASEP 

A physically relevant comparison of the NMF and SVD 
can be made by considering the probability distiibution of the 
TASEP stationary state. The Li-norm errors for the SVD and 
NMF approximations to this probability distribution biparti- 
tioned at centre of the system are presented in Fig. |5|a) for 
a = p = 1 and = 10. This shows that the SVD is exact up 
to machine precision once X > 5 indicating that the stationary 
state has weak enough coiTelations to permit a considerable 
amount of compression. The NMF fails to identify an accu- 
rate solution as % is increased, due to issues with local min- 
ima, and gives an eiTor which levels off above 10^^. Note also 
that negativity of the approximation found from the SVD only 
occurs for X = 2 and is several orders of magnitude smaller 



than the overall Li-norm error Finally, the upper-bound to 
the Li-norm error for the SVD truncation, computed from the 
L2-norm error, is also plotted and seen to provide a reasonably 
tight upper-bound. 



To move beyond a single matrix factorization we have also 
tested how the NMF and SVD behave when used repeatedly 
to simulate time-evolution. We applied cTEBD to the TASEP 
model, in which the two sets of 5 sites on either side of the 
central bipartition of a 10-site system were merged so only 
this single bipartition was considered. A calculation of the 
time-evolution was performed up to a time t — N, using both 
factorizations with % = 3. At this level of truncation the SVD 
performed roughly the same as the NMF in the stationary 
state, as seen in Fig.|5ja). The results for this time-evolution 
are shown in Fig.|5jb). Initially, the system is in a x = 1 state 
but the NMF fails to find accurate decompositions at small 
times, while the SVD succeeds. At later times the difference 
in errors at each time-step reduce to an order of magnitude. To 



examine the optimality issue explained in Sec. ( III C I we have 
repeated the calculation, this time using a full 10-site matrix 
product decimation. We found that while both NMF and SVD 
errors are slightly worse when decimation occurs at each bi- 
paitition, this is not more serious for the NMF than the SVD 
algorithm, as demonstrated in Fig.|5jc). Rather than optimal- 
ity, the problem faced using current NMF algorithms is the 
plateau in accuracy seen for increasing %. This not only pre- 
vents any improvements, in contrast to the SVD, but causes 
unnecessary accumulation of error Also to support the use of 
the SVD, it is found that even with X = 3 the evolution never 
produces any negative probabilities over this time. These re- 
sults strongly suggest that the potential for negativity when 
using the SVD within the cTEBD framework does not present 
a serious obstacle. For this reason we shall, from this point 
on, focus on SVD-based decimation. 
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FIG. 6: (Color online) The errors in simulating the 15-site TASEP with a = 0.3 and P = 0.6, using cTEBD with 5f = 10~^, as compared to an 
exact simulation, (a) Using x = 5 we plot the distances between the approximate and exact probability vectors, 1 1 1 /') — | 2) 1 1 1 and 1 1 1 Z') — | (2) I b 
along with the errors 1(0)''- (0)2 1 for each observable O. The upper-bound "E^ is also plotted and of the solid lines this takes the largest 
values, with the Li and then L2-norm errors below, (b) A zoom in on the small time part of (a) along with the Lj-norm error due to the Trotter 
expansion !Est calculated using a x = 100 simulation, (c) The behavior of some of these distance measures with increasing x at f = A'. 



B. Behavior of cTEBD errors 

We now examine in detail the behavior of the different types 
of errors of S VD-based cTEBD for small systems, in prepara- 
tion for studying the behavior of the same errors for larger sys- 
tems in the next section. As a demonstration of the relation- 
ship between different error measures we show in Fig.|6|a) the 
Li and L2-norm errors of a simulation of the initially empty 
15-site TASEP over a time t = 2QN. Also in the figure we 
have plotted the expectation value errors of three observables: 
an observable r whose value at each configuration was ran- 
domly generated from a uniform distribution between and 
1; the density at the middle site pg; and the current leaving 
the middle site J^. The random observable scales with the L2- 
norm error, because it is uncorrelated with the MPS approxi- 
mation method, as expected from the discussion of observable 
error scaling in Appendix [b] The current scales slightly above 
this, and the density lies even closer to the Li-norm error, indi- 
cating that there is some coiTelation between cTEBD's errors 
and these two observables. The Li -distance lies close to the 
maximum factor of 2^^/^ above the L2 -distance and, as must 
be the case, all errors are bounded by the Li -distance. 

We may isolate the Trotter error by using a % large enough 
such that the truncation error is negligible. Also, it is al- 
ways possible to efficiently calculate "E^, introduced in Ap- 
pendix |C] which upper-bounds the cumulative Li-norm error 
due to truncation. The interplay between truncation and Trot- 
ter errors can be clearly seen in Fig. |6|b), which shows that 
the Li -distance follows the Trotter error, until a time at which 
truncation errors become more significant, as indicated by the 
increasing bound. "E^ is an overestimate of 'Etr and after times 
of a few it plateaus, as can be seen in Fig. |6|a). As it is 
cumulative, the final value of can be used to bound the 
Li -distance between the cTEBD and exact probability distri- 
butions, and so all errors, up to this time. 

Another consequence of being cumulative is that it has 
less relevance for much longer times, when the system ap- 



proaches steady state, especially since the eiTors do not in- 
crease monotonically in time. Consider the evolution of the 
TASEP shown in Fig. |6|a). Initially the system is in a spe- 
cific configuration, in this case the empty lattice, which can 
be exactly represented by an MPS with % — I and so there is 
zero eiTor. In time, correlations are introduced into the system 
and the transient eiTors grow to a peak located at f « A^, be- 
fore decreasing as the steady state is approached, suggesting 
that the steady state is particularly compressible. This shows 
that even though a low % may not be sufficient to describe 
the transient behavior of the system accurately, it could still 
give accurate results in the steady state. This is because the 
TASEP is ergodic ^62] and the Trotter approximation to the 
evolution operator in Eq. (j8]l appears to preserve this property. 
So through whichever states the low % approximation to the 
transient behavior takes the system, whether this be eiToneous 
or not, the approximation to the steady state will be the same. 

For small systems we can use cTEBD to calculate any ob- 
servable of the TASEP precisely, because by increasing %, 
while still severely compressing the system, we can restrict 
the Li-norm error to small values. This is shown, for a time 
t ^ N near the peak in errors, in Fig. |6|c). As % is increased, 
truncation errors shrink until the errors are dominated by the 
Trotter error. Even though both the Li and L2-norm errors 
seem to decrease monotonically with %, from Fig. |6|c) it is 
clear this is not necessarily the case for observable errors. 
This non-monotonic behavior arises because the correlation 
between errors in the probability vector and an observable's 
values may vary with %. 

The behavior of the errors in time, shown in Fig.|6|a), sug- 
gests that the steady state is more compressible than the tran- 
sient states. To confirm this we evolved the TASEP up to 
t = lOON by which time the system was effectively in the 
steady state. For this state, we show in Fig. [7|a) that errors 
can be driven down to the Trotter eiTor for % even as small as 
4 when using a time-step 5? — 10^'. To obtain more accu- 
rate results, 5f can be decreased, and the results presented in 
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FIG. 7: (Color online) The steady state of the TASEP. (a) For the 15-site TASEP with a = 0.3 and (3 = 0.6 we have plotted the convergence 
of error measures to the Trotter error as x is increased. This is shown for 8f = 10^' (o) and §f = 10^"* ((}). For both observables O we have 
plotted I (O)^ — {0)^\. (b) For the same system as (a) this shows the Trotter errors, calculated from the x = 20 simulations, plotted against 
5f on a log-log axis. To guide the eye, we have connected the points with lines. They show a 8t^ dependence, (c) The steady state singular 
value spectra of the 20-site TASEP, corresponding to bipartite splittings through the middle of the system. Above ;U = 1 1 the singular values 
are integer zero. 



Fig.[7|b) confirm that the scaling of Est is with 5f^. The suc- 
cess of the cTEBD algorithm in describing the TASEP steady 
states accurately with an MPS of small % stems from the rapid 
decay of the singular value spectra of the bipartite splittings 
of the probability distribution. In |48| an exact sMPS for ev- 
ery A^-site TASEP steady state was obtained with a dimension 
X = + 1 . From our discussion in Sec. ( II B i we know an ex- 



act SVD-MPS must exist of a dimension less than or equal to 
that of an exact sMPS, and this can be constructed explicitly 
by repeatedly performing SVDs on the matrices of the sMPS 
given in BSl . By doing this we have plotted, in Fig. |7|c), 
the spectra of singular values for the 20-site TASEP for a few 
values of a and p. The spectra decay super-exponentially be- 
fore becoming integer zero after /j = N /2 + 1. Numerical tests 
suggest that this is the matrix dimension needed for an exact 
MPS. How these properties extend to larger systems is inves- 
tigated in the next section. 



VI. SCALABILITY 

The calculation of the exact Li -distance is intractable for 
large systems. To overcome this we use three approaches 
to determine the accuracy of calculated expectation values: 
Firstly, in the steady state we compare expectation values cal- 
culated using cTEBD directly with exact analytical results; 
Secondly, away from steady state, we use to upper-bound 
the Li -distance; Finally, we show for specific observables that 
as X is increased each expectation value converges. 

With the first approach in mind, we simulated the TASEP 
on a 200-site lattice for a total time of lOOA^, enough to reach 
the steady state, which, from our analysis of small systems, is 
when we expect the most accurate results to be obtained. In 
Fig. [8|a) we have plotted a comparison between cTEBD and 
analytical expectation values for the density at a site close to 
the exiting boundary of the system. Even for such a large sys- 
tem the error is Trotter limited for a x of 7. This observable is 
actually unusual. For the vast majority of expectation values 



considered, e.g. current, density and arbitrary two-site corre- 
lations, the Trotter error was reached for I, signifying the 
extreme compressibility of these states, also responsible for 
the success of mean field theory on these systems. It is also 
worth noting that the non-monotonic behavior of errors with 
X, observed for smaller systems, is present for larger systems. 

To understand this compressibility of the steady states it 
is instructive to calculate the bound on the Li-norm A, in- 
troduced in Eq. (j6]), using the method for calculating singu- 
lar spectra from the exact stationary sMPS |48|, discussed in 



Sec. (VBi. This gives us an upper-bound to the Li -distance 
that may be achieved by an SVD-MPS constructed in the way 
described in Appendix [A] Although this construction is not 
exactly what is performed by cTEBD, the accuracy of both 
depend on the same properties. We have calculated A analyt- 
ically and plotted the results in[8|b) for several system sizes. 
Except for some super-exponential decay when x~A^/2+lit 
exhibits a nearly perfect exponential decay in x and exponen- 
tial increase in of the form A = a^b^'^, with a = l .0770 and 
b = II .7797 respectively. Hence the MPS dimension needed 
to describe the steady state with errors less than A is given 
byx=N log^(fl) — log^A, as plotted in Fig.jsj^c). We find that 
accurate MPS approximations to the steady states of large sys- 
tems can be obtained using a x of order 10, due to the rapid 
decay of the singular value spectra of the steady states, in line 
with the cTEBD results we have obtained. 

A potential limitation to the scalability of the cTEBD al- 
gorithm using the SVD is the exponential scaling with A^ of 
the prefactor bounding the Lpnorm via the L2-iiorm, which 
appears in Eq. (j6]). This indicates that in order to maintain 
a given bound on the Li-norm the simulation must be per- 
formed with exponentially increasing L2-norm accuracy with 
A^. The extremely rapid decay of the singular spectrum for 
TASEP stationary states indicates that an Li-norm bound can 
be maintained by only a very moderate increase in X- How- 
ever, we note that in order to exploit this property it may be 
necessary to use higher precision arithmetic than the standard 
64-bit double precision. Indeed, for even moderately large 
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FIG. 8: (Color online) (a) The convergence of the cTEBD density expectation value (pig?)" to the analytical result (P197) for A' = 200. The 
system is in the steady state for a = 0.349, p = 0.537, having used 5/ = 5 x lO^-'. (b) A plotted for different % and N, from the a = 0.3, 
P = 0.6 steady state, calculated using the exact results from |48 1. (c) For the same system as in (b) this shows the MPS dimension % needed to 
ensure an SVD-MPS exists with Li -distance less than A, as a function of A'. 




FIG. 9: (Color online) (a) The dependence of on A' and x, corresponding to the upper and lower x-axes respectively. We used a = 0.3 and 
p — 0.6, taking the initial state to be the empty lattice, and setting 8t = 5 x 10^^. Each simulation was ran up to f = lOA' by which time the 
value of had plateaued. (b) The evolution of the current expectation values, calculated with % = 100, for A' = 50, a = 0.25 and P = 0.25. 
The errors are bounded by 1^ which plateaus at 6 x 10^*. (c) The errors between the expectation values of the products of the densities 
of three sequential sites {O = PePt+iPc+l} and the % = 100 result, showing a convergence to within 10^^ as % is increased to 50. We used 
a = 0.25 and p = 0.25 and calculated expectation values att = N. The time-step used was 5? = 5 x 10^-^, and at f = the lattice was empty. 



systems, above 50 sites or so, all but the first few singular val- 
ues have a ratio to the largest that is below double machine 
precision. TEBD will not be able to reproduce these values. 
This is a similar problem to what was experienced in |[35l for 
the Reaction-Diffusion model, but in our case numerical in- 
stabilities do not arise from this imprecision. 

When away from the steady state, we cannot calculate the 
exact spectrum or expectation values. However, we can cal- 
culate "E^ and we have done this for a range of and % to 
produce Fig.|9|a). The results show that to ensure a small er- 
ror, a larger % is needed for larger A^. We can also deduce, for 
instance, that the error in calculating the expectation value of 
any observable for A^ = 50 using X = 75 will not exceed lO^'*. 
In light of this we have plotted in Fig. |9|b) the time-evolved 
current profile of an initially empty 50-site TASEP for a time 
t « 5A^ using a x of 100. The system fills up as particles in- 
jected into the first site hop across the system, and steady state 
values are almost reached by f = 5A^. From we can be sure 



that all values are correct to a factor of 10^^, despite evolving 
through the transient behavior where the largest errors are ex- 
pected to appear Note that in this argument we have ignored 
the Trotter errors, which we have found to much smaller than 
the truncation errors, and are easily controlled by reducing 5f . 

Importantly, is an upper-bound, and so it is sufficient, 
but not necessary for it to take small values. As we saw in 
Fig. |6|a) this upper-bound is often a gross overestimation of 
the actual Li-norm error as it is the cumulative sum of upper- 
bounds to the Li-norm error in each two-site operation, as de- 
tailed in Appendix[C] Motivated by this, we have used cTEBD 
to simulate the non-equilibrium dynamics of larger systems, 
for which alone could not guarantee accurate results. We 
considered how quickly expectation values converge with %, 
and calculated this convergence for systems at t — N, which 
in the investigation of small systems was found to be a time 
close to that with the largest error. As shown in Fig.|9|c), a nu- 
merically calculated expectation value for A^ = 100 converges 
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such that the % = 50 and 100 results differ only by a fraction 
of 10^^. So convergence occurs such that only a moderate % is 
needed to accurately calculate the expectation value of this ob- 
servable, even for transient states. This observable for which 
the results are plotted could, for example, be used to indicate 
the presence of a traffic jam with the TASEP used to model a 
single carriageway. Similar results were obtained for a range 
of observables, indicating that the cTEBD algorithm is able to 
accurately simulate the transient dynamics of the TASEP for 
these system sizes. 



VII. CONCLUSIONS 



as being extendable to network geometries with a bounded 
tree-width [63, 64|, it could prove to be a powerful simulation 
option. The accuracy of cTEBD shown here also indicates 
that there is great scope to adapt other more sophisticated 
MPS/tensor network methods to classical stochastic systems 
and extend simulations to higher dimensions. These include 
the use of matrix product operators to express the evolution 
for longer ranged processes |65 -68| and renormalization in- 
spired variational techniques, which have already been used 
with great success for both stationary and dynamical calcula- 
tions of quantum systems 1491 150| . Future work will investi- 
gate the performance of cTEBD and its extensions to a wider 
range of systems and geometries. 



In this paper we have investigated in detail the application 
of the TEBD algorithm to classical stochastic systems. A very 
reasonable approach to describe probability distributions is to 
use an sMPS |48| which is manifestly non-negative. We have 
shown that cTEBD can produce approximations to the dynam- 
ics of stochastic systems within the sMPS class by modifying 
only the factorization method, namely switching it to NMF. 
Unfortunately, current NMF algorithms find it difficult to lo- 
cate accurate low rank solutions, due to the non-convexity of 
the factorization. However, issues of non-optimality due to 
the non-orthogonality of the MPS do not seem to cause prob- 
lems, so if future algorithms overcome the former problem 
NMF-based cTEBD could become a practical option. The 
main approach explored here was to relax the constraint to be 
explicitly non-negative by using instead a general real MPS 
and applying the SVD method for the evolution in an essen- 
tially identical fashion to quantum problems. We showed that 
potential issues involving the proliferation of negativity and 
that the L2 rather than Li -distance is minimized turn out not 
to present serious obstacles to the SVD-based algorithm for 
the system sizes considered. 

With the TASEP as our test system, we demonstrated the 
accuracy and applicability of cTEBD for non-equilibrium sys- 
tems. We focused our investigation on the behavior of errors 
with truncation x and system size A^. An interesting feature 
was the lack of a monotonic dependence of expectation value 
errors on %, though in all cases considered, with system sizes 
up to the low hundreds, results converged. For the steady state, 
comparisons with analytically calculated expectation values 
were excellent and we obtained negligible errors for % less 
than 10. Away from steady state we introduced an upper- 
bound to the Li-distance, and thus all errors, which could be 
efficiently calculated as part of the algorithm when simulating 
any system, not just the TASEP. This was used to ensure that 
some of our 50-site simulations had negligible errors. Even 
for large systems for which this bound could not guarantee 
small errors, we found that expectation values converged for 
a large range of observables. 

This work suggests that cTEBD is a viable candidate 
for studying the time-evolution of a large class of non- 
equilibrium stochastic systems whose dynamics are governed 
by a stochastic Hamiltonian consisting of at most two-site 
nearest-neighbor terms. Since cTEBD is easily adjusted to 
include many rates with time and space dependence as well 
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Appendix A: Construction of an MPS via repeated SVDs 

In this section we follow Vidal f\3\ and show how through 
the repeated use of SVDs (Schmidt decompositions) an MPS 
of any vector \\\t) can, in principle, be found. To do this 
we first compute the Schmidt decomposition of | \|/) for ev- 
ery contiguous bipartition into blocks lI^I and rI^I, as depicted 



in Fig. 10 Starting from the left boundary, each left Schmidt 



state I L^) is iteratively expanded in terms of the local basis of 
the rightmost site of the block | ic) and the left Schmidt states 

J')} of the neighboring splitting one site further to the 
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FIG. 10: (Color online) The sequence of A' — 1 contiguous parti- 
tions (according to the labeling imposed) of the system in which the 
Schmidt decompositions are computed. 
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left. This gives the following set of expansions 

d-l 



|Ll?i) 



,1=0 
d-l Xi 



12=0 /Ji = l 



-^=1/ 



I I AEWlLltll)!/,), 



iv)=E E 

'iv=0/Jw-l = 



4M'A'|t[A'-1]\|: \ 



(Al) 



The expansion coefficients can be seen to define the tensors 
{A M } of an MPS representation of | \\f) by taking the last ex- 
pansion of 1 in Eq. (All and inserting all the others into it 



until no Schmidt states remain 



l¥> 




The term in the parentheses is identical to the matrix product 
expansion of the amplitudes \\ti given in Eq. ([TJ, but with all 
the summations written out explicitly. Truncating each sum- 
mation (equivalent to truncating the Schmidt decomposition 
or SVD) to X terms results in |\fA), an approximate MPS of 
dimension %. 

While the full product of matrices {aI^I'*} generates \ \\f), a 
partial product up to a site £ instead forms an MPS of a left 
Schmidt state | L^j ) as 

|LS)=E(A['l^.APfe...AM^O..|i)L. 

The orthonormality of the left Schmidt states results in all the 
tensors {aM} derived from this construction obeying a left 
orthonormality property 1181 



d-l 



E (aI^'I'^UI^'I''' 



(A2) 



!f=0 



where 1^^^ is the x Xe identity matrix. Notice that we could 
have equally performed this construction starting from the 
right boundary and using the right Schmidt states {|R^J)}. 
This would result in a different set of tensors {A 1^1} obeying a 
right orthonormality property 



re=0 



(A3) 



If we were to apply the left procedure up to and including site 
i and the right procedure up to an including site £+1 then the 
resulting MPS would be of the form 



I^) = £aW'iaP1'2...a[^1"'d[^1a 



(A4) 



where Dl^l is the diagonal matrix of singular values 
This MPS has an orthogonality centre, or twist in its handed- 
ness, located precisely at the bipartition after site £, since the 



constraints Eq. ( A2 1 and Eq. ( A3 1 are obeyed by the tensors to 
the left and right of D^, respectively. This is crucial since we 



can readily extract from the MPS in Eq. ( A4 1 an expansion of 
I \|/) in an orthonormal basis 



l¥) = E E ¥« 



'e'f+i I 



Ll')l'»l't'+i! 



by forming the two-site tensor about the orthogonality centre 



It is only for this pair of sites, or {£ — l,£) and + 1 , £ + 2) 
adjacent to the orthogonality centre, that a tensor of ampli- 
tudes for |\|/) can be formed from components of the MPS 
in Eq. \AA) , such that all indices correspond to orthonormal 
states. As discussed in Sec. ( |III C\ moving the orthogonality 
centre, so as to ensure that it is always located adjacent to any 
pair of sites on which a two-site operator is applied, is essen- 
tial for the optimality of the subsequent SVD decimation. 



Appendix B: Observable errors 

Here, we consider how the errors in calculating an observ- 
able using an approximate probability distribution, such as 
that given by a truncated MPS, should scale with the L\ and 
L2-distances. Let \ P) be the exact probability vector and | Q) 
be an approximate probability vector Now consider an ob- 
servable O taking a value (9, for each configuration i. Since 
the expectation value is calculated as {OY = Li OiPi, the dif- 
ference between the expected values of the observable, calcu- 
lated using the two probability vectors is then 



(o)^-(o)e=EOi(Pi-a), 



(Bl) 



The worst case scenario for this error is that the observable 
values and the values of the probability error for each config- 
uration are correlated such that each term in the sum is of the 
same sign. In this case the magnitude of the error can equal 
its upper-bound 

\{of-{o)Q\<l^\om-Qil 

<max{Oi}|||P)-|e)||i. 

So the error in calculating an expectation value of an observ- 
able is always bounded by the Li-norm error. However, for 
errors to scale as badly as this requires complete correlation 
between the approximation method and the observable. For 
many observables we expect no such correlation and it is in- 
structive to consider how observable errors typically scale in 
this case. This typical scaling would be revealed by calcu- 
lating the expected value of the magnitude of the observable 
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error Eq. (Bl i, averaging over all possible pairings of observ- 
able values Oi with errors Pi Q\. An approximation to this, 
which we can calculate analytically, is the average over all 
ways of replacing the set {(9i} by a sample of 2^ values from 
itself. This is equivalent to replacing each Oi by a random 
variable O which takes values from the original {Oi} with 
equal probability, and hence its mean and variance, E[0] and 
Var[(9], are the mean and variance of {Oi}. This replacement 
is typically a good approximation for observables like cur- 
rent, density and other correlations. Then, normalizing |2) 
such that = 1^ we make the substitution of Oi for O in 
Eq. (|BT]) 



and find its expected value and variance to be 



E[{Of-{O)Q]^0, 
Weiv[{Of-{0)Q] =Var[0] 



l^)-IG)lli 



It then follows that 



E[\{0f-{0)Q\]^SD[d]\\\P)-\Q)\\2, 

where SD means the standard deviation. This scales with 
the L2 rather than the Li-norm error, and we conclude that if 
there is no correlation between observable values and the ap- 
proximation method then typically the observable error will 
scale with the L2-iiorm error An example of this is shown in 
Fig. |6|a). Note that due to the possibility of negative proba- 
bilities the normalization Li 2i = 1 is not always the same as 

lllG)lli = i- 



Appendix C: The Trotter and truncation errors 

In approximating the full stochastic evolution exp(i/f ) by a 
product of n time-steps, each an identical stochastic evolution 
^ — il\e=i Se){I\l=!^_iSe), we quantify the error by the Li- 
norm of the residual 




<U^""-S) n S\P)\\i + \\e"''{e"''-S) n ^l^>lli 

j=n- 1 i=n-2 

• . . + II n e^''(e«^' - S)S\ P) II 1 + II n e^"(e^'' - 5)| P) || 1 , 
<n||e«S'-5||i. 

Now ||e^^' — ^^11 1 ^ 5f^, while n = t/5t so the error scales at 
worst as 'Est ^ tdt^. 

For convenience let's relabel the total set of two-site 
stochastic evolution operators over the whole simulation as 
SkSk- I - ■ - Sk - ■ ■ S2S1 1 P) where K = 2nN and the index k de- 
notes the position in the sequence. The second type of er- 
ror is the truncation error incurred when any two-site operator 
S is applied to some MPS | Q) . Rather than being applied 
exactly, the subsequent factorization and truncation back to 
an MPS with a maximum inner-dimension of % produces an 
MPS \Q')- This can be thought of as being equivalent to 



implementing exactly some other, generally non-stochastic, 
transformation T, which depends on k, \ Q) and %, such that 
I fi') = T\Q). The error in doing this is then quantified by 
the Li-norm e = ||(5 — r)| 2)||i. Once the sequence has 
been performed up to K the accumulated approximation is 
TkTk-i ■■■Ti\Q). For each step k it is possible to extract 
from the TEBD algorithm an upper-bound to the Li-norm er- 
ror = \ \{Sii — Tk) r<._ 1 • • • Ti I P) 1 1 1 caused by truncation after 
applying Sk to the accumulative approximate state up to point 
k — I. Specifically for the S VD, for each two-site operation 
we can calculate the L2-norm error by the Eckart Young the- 
orem Eq. (|7| and thus get an upper-bound to each Li-norm £k 
by 2^/^ times this value. This information then provides an 
upper-bound to the total accumulative Li-norm error due to 
truncation 1^^ after K two-site operations since 

nr=\\\USk-UTk]\P)\\u 
[k=l k=l ) 

K-l K-2 

< USk-Tk) n Tk\P)\\i + \\Sk{Sk-i-Tk-i) J! Tk\P)\\i 

k=l k=l 

■■■ + \\f{SkiS2-T2)n\P)\U + \\flSkiS,-T,)\P)\\u 

k=3 k=2 

<\\{SK-TK)YiTk\P)h + \ I {Sk- i-TK-i)Y\Tk\P)h 

k=\ k=\ 

••• + ||(52-r2)ri|p>||i + ||(5i-ri)|P)||i, 

K 

k=\ 

Thus the truncation error grows at worst additively dur- 
ing the Trotter sequence and can be monitored within the al- 
gorithm by computing the sum of the upper-bounds for each 
two-site operation. This gives us a conservative upper-bound 
"E^ > Eti-. 



Appendix D: Non-negative matrix factorization algorittims 

The lack of convexity and non-uniqueness of the NMF 
problem for the three most common cost functions in Eq. (|9]) 
has prompted a variety of algorithms to be proposed |69|. 
These algorithms tackle the minimization problem directly 
for the required rank % using alternating least-squares, con- 
jugate gradients, multiplicative updates or projected gradient 
descent ll47l l69l iTOll . Here we shall briefly mention some de- 
tails for the latter two approaches. The projected gradient de- 
scent method additively updates the elements of the W and H 
matrices as 



Wif, ^ Wi^+^w 



dF 



dF 
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where r|iy and r\H are appropriately chosen descent step-sizes. 
For the Li-norm the relevant derivatives are 



dF 



Under these transformations the KL divergence is non- 
increasing and is invariant if and only if W and H are at a 
stationary point |60| . 



where Ay = (0- is an element of the residual ma- 
trix. Under these update rules there is no explicit preservation 
of the non-negativity of W and H so typically a projection of 
the updated matrices is made to the positive orthant at each 
descent step by setting any negative elements to zero. Fur- 
thermore there is no preservation of the unit Li-norm of 
within the approximation WH^ so this too has to be explicitly 
enforced ll69l . 

Gradient descent can also be formulated for the L2-norm 
and the KL divergence. However, by making step sizes rjw 
and r[H depend on the matrix element being minimized via 
specially chosen functions of the current matrices W and H, 
so-called multiplicative algorithms can be derived ll47l l60l . 
For the KL divergence the updates to be applied are 



Wi,. 



0ii 



'/J i 



011 



Overall, we find that the above NMF algorithms are highly 
sensitive to the initial conditions. This means that to get rea- 
sonable results many random restarts are required and con- 
siderable fluctuations in the final cost function value are ob- 
served. They also often show slow convergence (if at all) and 
are generally much slower than the SVD. For the most rel- 
evant case of the Lpnorm cost function we have found that 
using a random initial matrix driven to a KL divergence solu- 
tion often provides a much higher quality result than simply 
applying the gradient descent to random initial matrices. De- 
spite this, however, we have found that, for our task, the SVD 
is superior to current NMF algorithms not only at moderate 
truncations but also at identifying when a near exact low-rank 
solution exists [74J- This is an essential ingredient of an ef- 
fective decimation algorithm. 
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